Generalized envelope matching technique for time-scale modification of speech (GEM-TSM)
Abstract
A new time-domain, non-pitch-synchronous method for time-scale modification of broadband speech is proposed. The method builds on the SOLA (synchronous overlap-add) and EM-TSM (envelope-matching time-scale modification) methods: the sign envelope of EM-TSM is replaced by a generalized envelope formed from the highest bits of each sample. (The actual number of bits depends on the word-length constraints of the specific hardware.) In addition, a fixed-length scheme for calculating the cross-correlation is proposed, which eliminates the need for normalization after computing each cross-correlation value. With these improvements, the proposed method outperforms EM-TSM in both output quality and computational efficiency.
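The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 16-bit word length, the lag range, and the mapping of the retained bits to odd integers (chosen here so that keeping one bit reduces to a +1/-1 sign envelope) are all assumptions made for illustration.

```python
def generalized_envelope(samples, keep_bits, word_bits=16):
    """Reduce signed integer samples to their `keep_bits` highest bits.

    Assumption for this sketch: the retained bits are mapped to odd integers
    (2*q + 1), so keep_bits=1 yields a +1/-1 sign envelope.
    """
    if not 1 <= keep_bits <= word_bits:
        raise ValueError("keep_bits must be between 1 and word_bits")
    shift = word_bits - keep_bits
    # Python's >> on negative ints is an arithmetic shift, so the sign survives.
    return [2 * (s >> shift) + 1 for s in samples]


def best_lag(ref_env, cand_env, window, max_lag):
    """Return the lag in [0, max_lag] with the largest cross-correlation.

    Every lag sums exactly `window` products, so the scores are compared
    directly, without dividing by an overlap length that varies with the lag.
    """
    if len(ref_env) < window or len(cand_env) < window + max_lag:
        raise ValueError("segments too short for this window and lag range")
    best, best_score = 0, None
    for lag in range(max_lag + 1):
        score = sum(ref_env[i] * cand_env[lag + i] for i in range(window))
        if best_score is None or score > best_score:
            best, best_score = lag, score
    return best
```

In a SOLA-style loop, `ref_env` would be the envelope of the tail of the output produced so far and `cand_env` the envelope of the next input frame. The returned lag is where the frame would be overlap-added. Raising `keep_bits` trades cheaper arithmetic for a finer match.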
Similar resources
Fast Time Scale Modification Using Envelope-Matching Technique (EM-TSM)
We propose a technique called Envelope-Matching for Time Scale Modification (EM-TSM). This technique is a modification of synchronized overlap-and-add (SOLA) [5] with significantly reduced computational complexity. The reduction is useful for fast browsing of audio or video, which can then be implemented on a general single-processor machine.
Fast SOLA-Based Time Scale Modification Using Envelope Matching
Time scale modification (TSM) of speech and audio signals is very useful in many applications such as MPEG-4 and fast/slow browsing of pre-recorded materials. Synchronized Overlap-and-Add (SOLA) is a time-domain TSM algorithm known to achieve good speech and audio quality. One problem of SOLA is that it requires a large amount of computation in the search of the best matching point between the ...
VLSI implementation of a TSM/FSM algorithm
The time scale modification (TSM) of speech is concerned with compressing or expanding audio signals in the time domain without affecting the signal's pitch or naturalness. Conversely, the frequency scale modification (FSM) of speech is concerned with altering the pitch and formants of a signal without changing the signal duration. This paper describes a hardware implemented and optimized...
A novel high quality efficient algorithm for time-scale modification of speech
We present a novel, efficient algorithm for time-scale modification (TSM) of speech that gives output quality equal to that of a conventional TSM algorithm, but with a computational load an order of magnitude lower. The algorithm presented uses a fixed length rectangular stepping window and a simple peak alignment criterion to track the local natural scaling factor and adapt the window step size...
Using Audio Time Scale Modification for Video Browsing
In the IBM CueVideo project we study various aspects of fully automated video indexing, browsing and retrieval. The technical aspects include audio processing, speech recognition, image processing and information retrieval. Equally important, however, is exploring user expectations and conducting user studies. We focus on the field of video for Training and Education, including Distributed Lear...
Publication date: 2005